SIKS course: Reinforcement Learning for Adaptive Hybrid Intelligence

Introduction

On October 20th and 21st 2025, the School for Information and Knowledge Systems (SIKS) will be organising a new two-day course on Reinforcement Learning for Adaptive Hybrid Intelligence.

Reinforcement learning (RL) is one of the main paradigms of machine learning, in which artificial agents learn optimal behaviour from interaction data. Recent years have seen notable breakthroughs in robotics, in games such as Atari and Go, and in the training of large language models such as ChatGPT. While its focus on autonomous and active learning makes reinforcement learning a powerful tool, deploying a reinforcement learning agent as an assistant or collaborator (that is, as part of a hybrid intelligence) raises specific challenges.

In this two-day course we will take a look at the foundations of reinforcement learning and various extensions that are important for hybrid intelligence. These include:

  • “Safe reinforcement learning”: safety, ethical, or legal constraints can shape the reward function and guide the artificial agent in its exploration vs. exploitation trade-off
  • “Causal reinforcement learning” covers the design of RL agents that interact with the environment both to explain its data-generating process and to make better decisions when equipped with causal knowledge
  • “Learning from feedback” will focus on obtaining behaviour or reward functions from data or interactions in settings where reward functions are hard to specify manually
  • “RL in the context of LLMs” will focus on preference optimization, RLHF, and the GRPO algorithm for aligning models with human preferences, with a hands-on demo.
  • “Multi-agent RL (MARL)” will focus on the setting where multiple RL agents interact in a shared environment, requiring communication and coordination.
  • “Safe MARL”: even when every agent has an individual safety guarantee, the overall multi-agent system can still be unsafe due to coordination and cooperation issues. This component will address such challenges, in addition to maintaining safe exploration using formal constraints in multi-agent environments.
  • “Multi-objective RL” will focus on scenarios where the agent should take into account multiple, potentially conflicting, criteria and goals.
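Several of these topics build on the tabular foundations covered on day 1. As a rough illustration of the exploration vs. exploitation trade-off mentioned under safe reinforcement learning, here is a minimal Q-learning sketch with an ε-greedy policy on a toy chain environment (an illustrative example only; the environment, hyperparameters, and code are assumptions for this sketch, not course material):

```python
import random

# A tiny deterministic chain MDP: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 yields reward 1 and ends the episode.
N_STATES, GOAL = 5, 4

def step(state, action):
    """Move left or right along the chain; reward only at the goal."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
greedy = [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES)]
print(greedy)  # → [1, 1, 1, 1, 1]: the learned policy moves right toward the goal
```

With too little exploration (ε = 0) the agent can lock onto an early suboptimal habit; constraining or shaping this exploration is exactly where the safety considerations above come into play.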

Although the course is primarily intended for SIKS PhD students, other participants are welcome. However, the number of places available to them is limited and will depend on the number of SIKS PhD students participating in the course.

Course coordinators

Erman Acar (UvA)
Herke van Hoof (UvA)
Shihan Wang (UU)

Location

The course will take place at hotel Landgoed Huize Bergen in Vught, near Den Bosch.

Programme

Tentative schedule:

Day 1 (RL): Monday 20 October
9:15 – 9:30 Doors open
9:30 – 9:45 Welcome and introduction
9:45 – 10:45 Advanced Basics I (Herke van Hoof)
10:45 – 11:15 Coffee break
11:15 – 12:15 Advanced Basics II (Herke van Hoof)
12:15 – 13:30 Lunch
13:30 – 14:15 Safe reinforcement learning (Thiago D. Simão)
14:15 – 15:00 Causality and reinforcement learning (Sara Magliacane)
15:00 – 15:30 Coffee break
15:30 – 16:15 Sharpie tutorial (Floris den Hengst & Libio Goncalves Braz)
16:15 – 17:00 Inverse RL and imitation learning in robotics (Jens Kober)
17:30 Dinner

Day 2 (MARL): Tuesday 21 October
9:30 – 10:30 Basics of MARL, including game theory (Wendelin Böhmer)
10:30 – 11:00 Poster session I
11:00 – 11:30 Coffee break
11:30 – 12:15 Communication in MARL (Shihan Wang)
12:15 – 13:30 Lunch
13:30 – 14:30 Safe multi-agent RL (Erman Acar)
14:30 – 15:00 Poster session II
15:00 – 15:30 Coffee break
15:30 – 16:15 LLMs + RL / RLHF (Karel D’Oosterlinck)
16:15 – 17:00 Multi-objective learning & preference elicitation (Roxana Radulescu)
17:00 – 17:05 Closing

Registration

Registration is now closed. Please note that we are unable to book overnight stays for participants at this time. Should you wish to stay at Landgoed Huize Bergen, please contact the venue directly.
If you have any questions about registration, please do not hesitate to contact Renée Otten.

Information for non-SIKS PhD students

SIKS needs confirmation from your supervisor or office that they agree with the arrangement and the payment conditions.